Inverse resolution of spatially varying diffusion coefficient using Physics-Informed neural networks

Resolving the diffusion coefficient is a key element in many biological and engineering systems, including pharmacological drug transport and fluid mechanics analyses. Additionally, these systems often have spatial variation in the diffusion coefficient which must be determined, such as for injectable drug-eluting implants into heterogeneous tissues. Unfortunately, obtaining the diffusion coefficient from images in such cases is an inverse problem with only discrete data points. The development of a robust method that can work with such noisy and ill-posed datasets to accurately determine spatially-varying diffusion coefficients is of great value across a large range of disciplines. Here, we developed an inverse solver that uses physics informed neural networks (PINNs) to calculate spatially-varying diffusion coefficients from numerical and experimental image data in varying biological and engineering applications. The residual of the transient diffusion equation for a concentration field is minimized to find the diffusion coefficient. The robustness of the method as an inverse solver was tested using both numerical and experimental datasets. The predictions show good agreement with both the numerical and experimental benchmarks; an error of less than 6.31% was obtained against all numerical benchmarks, while the diffusion coefficient calculated in experimental datasets matches the appropriate ranges of other reported literature values. Our work demonstrates the potential of using PINNs to resolve spatially-varying diffusion coefficients, which may aid a wide-range of applications, such as enabling better-designed drug-eluting implants for regenerative medicine or oncology fields.


Introduction
Understanding the spatial variation of the diffusion coefficient is a key element in many biological and engineering systems (e.g., drug transport, biomedical transport, petroleum engineering, and hydrogeology) [1,2].The diffusion coefficient is a proportionality constant which relates the gradient in concentration of a species with the mass flux and gives a measure of the rate of diffusion.However, there are numerous hurdles when attempting to resolve the diffusion coefficient from experimental data.First, the diffusion coefficient often varies spatially, sometimes discontinuously, due to heterogeneity in biological tissues, porous media, and soil.Second, determining the diffusion coefficient from experimental images is an inverse problem, where the parameters of the system need to be determined from the given observations.Numerical instability is a challenge for such problems, which are fundamentally ill-posed in the Hadamard sense [3].Even a small perturbation in experiments can significantly change the estimated diffusion coefficient [4].Additionally, data is always collected at discrete points, such that any errors in observation will be amplified in the parameters to be determined [5].Moreover, multiple models can fit the data and uniqueness of the solution for the inverse problem is not guaranteed.A robust method which can work with noisy data to accurately determine the diffusion coefficient has wide-ranging applications and is of great value across a large range of disciplines.Many numerical methods replace the original inverse problem by linearization around constant coefficients in order to solve a simpler problem [6].For example, numerical methods such as finite differences in combination with least squares [7], genetic algorithm with sinc-Galerkin method [8], as well as Bayesian methods [9,10] have been used to estimate the diffusion coefficient for the observed data.However, obtaining numerical solutions that are satisfactory using such methods is often difficult [6].
Neural networks are powerful computational tools that have the capacity to represent complex linear and non-linear systems [11].There is a growing interest in using neural networks to extract latent information from data in engineering fields like fluid mechanics [12] and material engineering [13].Machine learning techniques have been used to obtain corrective terms for improving closure models in computational physics through inverse modelling of high-fidelity simulation and experimental data [14,15].There is a growing interest in using neural networks as a tool for better understanding biological systems such as studying membranes in fluorescence microscopy [16] and analysis of molecular images [17].Although traditional neural networks thrive on data, high-fidelity computational and experimental data is scarce and expensive when it comes to many engineering and biological systems.Raissi et al. [18,19,20] have introduced physics informed neural networks (PINNs), which are neural networks that respect physical laws described by partial differential equations (PDEs) while performing supervised learning tasks.PINNs are extremely data efficient neural networks that can be used to solve PDEs (forward problems) or to discover the coefficients of terms in the PDE given the solution (inverse problems) [21].PINNs have offered a novel solution methodology to diverse problems like brittle fracture mechanics [22], learning constitutive relations for a non-stationary bioreactor [23], modelling the thermochemical curing process of composite-tool systems during manufacture [24], predicting arterial blood pressure from 4D MRI data [25], and modelling soft tissues [26].The methodology has also been used to solve inverse problems like reconstructing the velocity and pressure fields from flow visualizations [27] and discovering turbulence models from noisy spatio-temporal data [28].The ability of PINNs to work with noisy data makes it a promising option to solve inverse problems using experimental obser-vations.
Our focus in this work was to employ PINNs to solve the inverse problem of discovering the spatially varying diffusion coefficient from spatio-temporal information of the diffused passive scalar.Our method was subjected to both numerical and experimental data sets.We first used clean numerical data for verification of our method and then worked with three distinct experimental datasets which contained noise.These results demonstrated that we were able to accurately resolve the diffusion coefficient in different liquids and hydrogels.Additionally, the framework presented is robust and the method can be modified to tackle inverse problems of similar nature with minor modifications.In total, we conclude that this method may be an ideal platform to predict the properties of many substrates in biological systems based on spatially-varying diffusivity, with numerous engineering applications.

Problem definition and methodology
Our objective was to solve the inverse problem of determining the spatially varying diffusion coefficient which best described a given set of spatio-temporal data of the concentration of a passive scalar using PINNs.To do so, we employed the use of fully connected neural networks.Fully connected neural networks, or feed-forward neural networks, have the simplest architecture where the output of one layer of the neural network is used as the input to the next layer.For a network with an input layer, an output layer, and H hidden layers, the k th hidden layer will receive an output from the previous layer (x k−1 ) and perform the affine transform first where w k and b k are the weights and biases associated with the k th hidden layer.Before sending this output as the input to the next layer, a nonlinear activation function σ is applied.For an input x, a fully connected neural network can be mathematically written as where • is the composition operator and θ represents the trainable weights and biases of the network which are optimized.The key idea of physics informed networks is to constrain the neural network such that the residual of the pertinent partial differential equation is minimized.Consider the transient diffusion equation for a concentration field of a passive scalar c(x, y, t) in two dimensions with a spatially varying diffusion coefficient This is a parabolic equation, analogous to the equations pertinent to problems in heat transfer and the same framework can be used to determine thermal properties using temperature measurements.We approximate the functions c(x, y, t) and D(x, y) using two deep fully connected neural networks.The only known data are point clouds inside the domain {x, y, t, c(x, y, t)}.Given this spatio-temporal data, we want to infer the spatially varying diffusion coefficient D(x, y).To solve this inverse problem, we designed a PINN to approximate the diffusion coefficient D(x, y) which best describes the concentration field c(x, y, t) data.Now, considering (3), we define where the subscripts denote the derivatives.We now create a physics-informed neural network using backward Euler discretization.We define where θ and ϕ are the parameters for the neural networks used to approximate c(x, y, t) and D(x, y), respectively.As shown in Fig. 1, to create our PINN we obtained c pi corresponding to equation (3) using the two neural networks defined above.The identity operator is represented using I and the differential operators are represented using ∂x, ∂y and ∂t in Fig. 1.The derivatives which do not appear in the equations will not be a part of the computational graph and will not contribute to any computational costs.Another challenge associated with this inverse problem worth emphasizing is that regions where the concentration field does not vary, or the gradient is zero, gives no information for us to learn and determine the diffusion coefficient.It is noted that automatic differentiation provides us with derivatives calculated with machine precision.We are working with point clouds and just one global function and there is no discretization involved.This is significant as a lattice like distribution of points is not a requirement and we can work with any distribution of points in the spatiotemporal domain.The popular open-source library Tensorflow [29] was used to create and compile the computational graphs for our physics-informed neural network.The parameters for neural networks c(x, y, t) and D(x, y) were optimized by minimizing the following mean squared error (M SE Dif f ) function: Here σc is the standard deviation of the concentration, {x i , y i , t i } N i=1 denote the points where we have information on c(x, y, t) and the number of such points is N , {x i , y i , t i } N e i=1 denote the points where the equation is evaluated on and the number of such points is N e .The second term in eq. ( 6) is the consistency loss, which ensures that there is consistency in the regressed concentration and the physics informed concentration [30].Both neural networks c(x, y, t) and D(x, y) were densely connected with 8 hidden layers each.The network (Fig. 1) was trained with a decaying schedule for the learning rate using the Adam optimizer [31].

Results and discussion
To evaluate the accuracy of our method, the predicted values for concentrations and diffusion coefficients were compared with both data generated from numerical simulations (Section 3.1) and experimentation (Section 3.2).

Numerical Simulations Data
We considered four different spatially varying diffusion coefficients (referred to as Dtrue) as our test cases and numerically solved the diffusion equation with a spatially varying diffusion coefficient (Eq.( 3)) to get the concentration field of the passive scalar (referred to as ctrue).The PINN framework described in section 2 was then used to determine the spatially varying diffusion coefficient that best described the concentration field distribution.Numerical simulations were carried out using the open-source CFD toolbox OpenFOAM [32] with a 500×500 mesh (x, y ∈ (0, 1)) and 100 time steps (t ∈ (0, 1)).The mesh point generated from the simulation were used as training points for the PINNs.For all four cases, 5000 points were randomly sampled across time and space from inside the domain for each iteration using the data iterator provided by Tensorflow [29].
For all cases, we considered a square domain and defined the side of the square to be the length scale L and the total time to be the time scale T. The concentration field was scaled using the maximum value of the concentration, hence the dimensionless field c ∈ [0, 1].We defined x = x ′ L , y = y ′ L and t = t ′ T where x ′ , y ′ and t ′ are the dimensional x-coordinate, y-coordinate and time, respectively while x, y and t are the dimensionless x-coordinate, y-coordinate and time, respectively.The diffusion coefficient distribution was scaled by T L 2 to get a dimensionless diffusion coefficient distribution D(x, y).The parameters of the networks were updated using the gradient of the loss with respect to the network parameters to get the optimum set of parameters for the mapping (x, y, t) −→ (c) and (x, y) −→ ( D).For the given concentration distribution, the networks used regression to learn the concentration distribution c(x, y, t) and minimized the residual defined in equation ( 5) to determine the diffusion coefficient distribution D(x, y) which best described the concentration data.
In case I, we simulated the concentration of a passive scalar corresponding to the spatially varying diffusion coefficient Dtrue(x, y) This resulted in a smooth function akin to a two-dimensional parabola.To compare the results predicted by the neural networks to the true value and simulation results, we defined the relative error to be where the bar denotes the mean value.This definition for error is used so that multiplication or addition of a constant does not change the error.In case II, we considered a more complicated two dimensional sinusoidal distribution Dtrue(x, y) := 0.25 + 0.1[sin(2πx) + sin(2πy)] and for case III, we increased the frequency of the si-    The same hyperparameters were used for all cases.Figure 2 shows a snapshot of the results from the network at t = 0.5T for all the four cases.The first column shows the exact value of the concentration (c(x, y, 0.5T )) and the second column shows the value predicted by the neural network (c(x, y, 0.5T )).The third column in figure 2 shows the true value of the diffusion coefficient (D(x, y)) while the fourth column shows the value predicted by the neural network ( D(x, y)).The relative error between the learned concentration and diffusion coefficient distributions and the true values from the simulation are summarized in Table 1.Our method learned the diffusion coefficient reliably in all cases, with the relative error being below 6.31 × 10 −2 in all cases.Figure 3 compares the exact diffusion coefficient and the diffusion coefficient predicted by the neural network in all the four cases along the horizontal cross-section at the midpoint.To test the robustness of our method to sparse data, we reduce the number of spatial points as observations to the concentration field systematically for case 1.We look at 250000, 25000, 2500, 250 and 25 spatial data points and the results are shared in table 2. The error in the diffusion coefficient only marginally increases till the number of points are reduced to 250.The solver finally breaks down when only 25 spatial points or "pixels" are considered.This demonstrates that our framework is not sensitive to resolution of data.With the method performing well with noiseless data, we next considered images from experiments.

Experimental Data
To verify our algorithm on experimental data, we obtained time lapse images of Rhodamine 6G dye diffusing through (1) two liquids (Section 3.2.1),( 2) agarose, and (3) gelatin hydrogels (Section 3.2.2).The intensity of the images was assumed to be proportional to the concentration of the dye for our calculations.Working with experimental data is significantly more challenging compared to numerical data as various sources of noise contribute to perturbations in the observations.These perturbations can significantly change the estimated diffusion coefficient [4].Neural networks can be effective tools for processing the images and mitigating noise effects.A neural network can be trained on a regression task over the concentration data and automatic differentiation can be used to visualize the temporal and spatial derivatives.This helps in identification and removal of aberrations and helps in understanding the data better.As part of processing the images, the space and the time domains were non-dimensionalized using an appropriate length and time scale.

Diffusion through Liquid
The setup, shown in Fig. 4(a), was designed to investigate the diffusion coefficients of different liquid solutions.A square cuvette [Polymethyl methacrylate (PMMA) cuvette] with dimensions of 1.25×1.25×4.5 cm was filled with two layers of solutions: DI water (liquid 1) as an upper layer and a glycerine-water mixture (20% wt, liquid 2) in a lower layer.The water solution on top contained 100 µM Rhodamine 6g dye (Sigma-Aldrich, Rhodamine 6G, dye content 99%, C28H31N2O3Cl, molecular weight 479.01 g/mol, SKU 252433) and over time, diffusion occurred between the two layers.
A camera (Imperx, CLM-B6640M-TF000) was used to capture the diffusion every minute with a resolution of 6640×4400 pixels.The images were analyzed using a custom Matlab algorithm to measure the dye intensity in each frame.The setup was placed inside a photobox to prevent ambient light interference and the temperature was maintained at 23 • Celsius.Different values have been reported for the diffusion coefficient of Rhodamine 6G [33,34].Here we used the following viscosity values of µ1 = 9.35 × 10 −4 Ns/m 2 and µ2 = 1.60 × 10 −3 Ns/m 2 for the respective solutions and calculated the diffusion coefficient at room temperature (using Stokes-Einstein equation) as D1 = 4.05 × 10 −10 m 2 /s and D2 = 2.36 × 10 −10 m 2 /s [33,34,35].For this case, the neural network prediction for the spatially varying diffusion coefficient is shown in Fig. 4 (b).We have used a cuvette to look at the diffusion between liquids.The images were then captured using a camera and then processed through a solver.We believe that the curvature has been caused as the interface between the two fluids isn't perfectly flat and due to the distortion effects of approximating the three-dimensional cuvette using two-dimensional images.Two distinct regions are visible in the plot and the diffusion coefficient from the neural network has the value 4.0×10 −10 m 2 /s and 2.4×10 −10 m 2 /s for the water and glycerine-water regions, respectively, which are in good agreement with the calculated ground truth values and with previously-reported literature using numerous techniques [33,34,35].

Diffusion through Gels
Rectangular prisms of either gelatin or agarose hydrogels were fabricated such that varied concentrations of Rhodamine 6G dye in phosphate buffered saline (PBS) would diffuse inward from each side, mimicking the computational setup.A custom computeraided design (CAD) part was drawn in Autodesk Inventor and fabricated using Flexible Resin V1 (RS-F2-FLGR-02) on the Formlabs Form 3 SLA 3D printer.The part was made to fit the diameter of the CELLSTAR 35x10mm cell culture petri dish (Cat.No. 627-160), segmenting the dish into four equal reservoirs with an open container in the center where the hydrogel was situated.In both experiments, the hydrogels were 4 mm 2 by 5 mm tall rectangular prisms, dictated by the container size of the 3D printed part.The windows on the sides of the gel that contacted the PBS/dye baths were 3 mm 2 squares.Directly before the kinetic experiment began, different volumes of a 1 mM Rhodamine 6G stock solution were added into the different PBS reservoirs, resulting in the source dye concentrations shown in Fig. 4(c,e).Brightfield and red fluorescence images (BioTek Cytation 5 RFP, 531, 593 filter cube and LED cube set) were captured on the BioTek Cytation 5 using the Gen5 software with the BioTek 4x Plan Fluorite phase objective (BioTek, 1320515).
In the agarose experiments, 2.5% wt/vol agarose (Fisher BioReagents, BP160500) in PBS was flooded into the hydrogel container while a plastic cylinder, 1.5 mm in diameter, was positioned in the center of the container as a place-holder.Once the 2.5% agarose gelled completely, the cylinder was removed and 0.5% wt/vol agarose was transferred into the resulting hole and allowed to solidify (Fig. 4(c)).Brightfield and red fluorescent images were obtained every minute.
For gelatin experiments, gelatin from porcine skin, gel strength 300, Type A (Sigma Life Sciences, G2500-500G) was used at 5% and 15% wt/vol.Because of the nature of gelatin, the same fabrication technique could not be used as above.Instead, 15% gelatin in PBS was drawn up into the luer slip tip of a 1 mL syringe (approximately 2.2 mm in diameter) (Henke Sass Wolf, 4010-200V0) and allowed to gel.The gelatin cylinder was then placed into the 3D printed part and 5% gelatin was flooded around the rest of the container and allowed to gel (Fig. 4(e)).Due to the slower diffusion through gelatin which required longer kinetic runs, brightfield and red fluorescent images were obtained every 2.5 minutes.
In both hydrogel cases, the neural network prediction was able to identify two distinct regions of spatially-varying diffusion coefficients corresponding to the various formulations of the hydrogels.Within the agarose gel, values of approximately 2.0 × 10 −10 m 2 /s and 9.0 × 10 −10 m 2 /s were obtained for the 2.5% and 0.5% wt/vol agarose regions, respectively (Fig. 4(d)).Similarly, the values for the diffusion coefficients in the two gelatin regions were roughly 2.4 × 10 −10 m 2 /s and 3.7 × 10 −10 m 2 /s for the 15% and 5% wt/vol gelatin regions, respectively (Fig. 4(f)).Because our experimental setup introduces a varied concentration gradient of Rhodamine 6G at each interface of the hydrogel, the obtained diffusion coefficients will not be identical to literature values that use saturated gels, such as the common Fluorescence Recovery After Photobleaching (FRAP) technique.However, we observed the expected trend where both the 2.5% and 15% gels had lower diffusion coefficient values than their less-dense counterparts.Additionally, the range of values obtained is consistent with those found in the literature for similar setups.Golmohamadi et al. found the diffusion coefficient of Rhodamine 6G to be approximately 2.5 − 3 × 10 −10 m 2 /s at neutral pH in 1.5% agarose gels [36].Samprovalaki et al. similarly found the diffusion coefficient of Rhodamine 6G to be around 2.89 − 4.88 × 10 −10 m 2 /s at 30 degree Celsius in 1.5% agar gel depending on the concentration of Rhodamine 6G used [37].Within gelatin, Stucchi et al. observed a diffusion coefficient for Rhodamine 6G of 4 × 10 −11 m 2 /s for 5% gelatin hydrogels.However, these hydrogels were crosslinked using diethyl squarate to increase the gel density and other mechanical properties, which would result in a slower rate of diffusion than the unmodified gelatin used here [38].Additionally, the diffusion maps in Fig. 4(d,f) were able to capture the varied boundary qualities between the agarose and gelatin samples.The agarose results indicate a more jagged boundary between the gel concentrations, likely resulted from the removal of the plastic cylinder after the 2.5% agarose had gelled around it.This is not apparent with the gelatin results (Fig. 4(d)) after the fabrication method was altered such that the cylinder of 15% gelatin was extruded separately, placed within the custom-printed holder, and flooded with 5% gelatin.The jagged lines that we observe are due to the nature of the experiment, where distinct gels were artificially connected to mimic spatially varying diffusion.Hence, the boundaries are not smooth and lead to sudden variation in the diffusion coefficient.We don't expect to see such variations where anomalous diffusion is naturally present and the underlying structure is inherently heterogenous.Taken together, these results provide strong evidence that the algorithm presented is accurate for several types of experimental data that may utilize distinct materials and hardware to capture rates of diffusion.

Conclusion
The diffusion coefficient often varies spatially in biological tissues, porous media, and soil.In this work, we used a physics informed neural network (PINN) to solve the inverse problem of discovering the spatially varying diffusion coefficient from spatiotemporal information on the diffused passive scalar.The results of the framework were first validated and the accuracy of the method was tested using noiseless, numericallygenerated data.Inverse problems are notoriously difficult to solve as they are often ill-posed and small perturbations can significantly change the estimated diffusion coefficients.During experiments, multiple sources of noise create such perturbations in the recorded data.To demonstrate the robustness of the framework, we then considered experimental data as Rhodamine 6G dye diffused through different liquids and hydrogels.Specifically, for diffusion through liquids, diffusion of the dye was captured from a water layer into a layer of a water-glycerine mixture.For diffusion through gels, Rhodamine 6G dye was diffused through either 2.5% and 0.5% wt/vol heterogeneous agarose gels, or 5% and 15% wt/vol heterogeneous gelatin hydrogels.It was shown that our framework captured the spatial distribution of the diffusion coefficient and the interface of distinct regions well in each test case.Using this methodology can thus allow the prediction of the spatiotemporal distribution and variation inside biological samples, where understanding the heterogeneity is crucial.This has implications to aid numerous biological fields and could allow for more accurate calculation of the diffusion of growth factors or other therapeutics from implanted devices, or contribute to a diagnosis of aberrations such as carcinogenic tumours in tissues using data acquired through medical imaging, as two examples.

Figure 1 :
Figure 1: We use two densely connected neural networks which are 8 layers deep for both the concentration (c) and diffusion coefficient( D).For the purpose of this schematics, the networks are shown to have 2 hidden layers and 5 neurons per hidden layer.

Figure 2 :
Figure 2: Results for the numerically simulated data.The first column shows the exact concentration distribution, the second column the predicted concentration distribution, the third column shows the exact distribution of the diffusion coefficient and the fourth column shows the predicted distribution of the diffusion coefficient for (a) Case I (b) Case II (c) Case III and (d) Case IV.

Figure 3 :
Figure 3: Comparison of the correct (exact) diffusion coefficient and the diffusion coefficient predicted by the neural network along the horizontal cross-section at the midpoint for (a) Case I (b) Case II (c) Case III and (d) Case IV.The exact value is represented by a solid red line while the prediction of the network is shown by a dotted blue line.

Figure 4 :
Figure 4: Experimental setup and results for diffusion of Rhodamine 6G dye.In each setup, the approximate areas used for diffusion analyses are denoted with red boxes.(a) Different liquid solutions in a square cuvette (DI water and glycerine-water mixture).(b) The neural network prediction for the diffusion coefficient for water and water-Glycerine mixture.(c) Initial source concentrations of Rhodamine 6G dye in PBS to diffuse into a 2.5% and 0.5% agarose gel chamber.(d) The neural network prediction for the diffusion coefficient for diffusion through the agarose gel.(e) Initial source concentrations of Rhodamine 6G dye in PBS to diffuse into a 5% and 15% gelatin chamber.(f) The neural network prediction for the diffusion coefficient for diffusion through gelatin the gel.The x and y coordinates were made dimensionless using appropriate length scales.

Table 1 :
Numerically simulated data for the distribution of diffusion coefficient, D true (x, y), and the relative error of the predicted to true concentration field and diffusion coefficient, respectively, for each test case.

Table 2 :
The effect of reducing the number of spatial points on the diffusion coefficient for case 1.